
    Evaluation of optimization techniques for aggregation

    Aggregations are almost always placed at the top of the operator tree, after all selections and joins in a SQL query. They can, however, be performed before joins, which makes the later joins much cheaper when used properly. Although some enumeration algorithms that consider eager aggregation have been proposed, there are not enough evaluations to guide the adoption of this technique in practice, and none use real data sets and real queries with estimated cardinalities. It is therefore not known how eager aggregation performs in the real world. In this thesis, a new estimation method for group-by and join, combining the traditional estimation method with index-based join sampling, is proposed and evaluated. Two enumeration algorithms that consider eager aggregation are implemented and compared under estimated cardinalities. We find that the new estimation method works well with little overhead and that, under certain conditions, eager aggregation can dramatically accelerate queries.
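The idea of aggregating before a join can be illustrated with a small sketch. It is not taken from the thesis: the table layout, the column names (`customer_id`, `amount`, `region`) and the sample rows are invented, and the rewrite is only valid here because `id` is assumed to be a unique key of the customer table.

```python
from collections import defaultdict

# Hypothetical sample data (not from the thesis).
CUSTOMERS = [
    {"id": 1, "region": "EU"},
    {"id": 2, "region": "EU"},
    {"id": 3, "region": "US"},
]
ORDERS = [
    {"customer_id": 1, "amount": 10},
    {"customer_id": 1, "amount": 5},
    {"customer_id": 2, "amount": 7},
    {"customer_id": 3, "amount": 20},
    {"customer_id": 4, "amount": 99},  # no matching customer: dropped by the inner join
]


def join_then_aggregate(orders, customers):
    """Lazy plan: join every order row to its customer, then group by region."""
    region_of = {c["id"]: c["region"] for c in customers}
    totals = defaultdict(int)
    for order in orders:  # one join probe per order row
        region = region_of.get(order["customer_id"])
        if region is not None:
            totals[region] += order["amount"]
    return dict(totals)


def aggregate_then_join(orders, customers):
    """Eager plan: pre-aggregate orders per customer, then join the smaller result."""
    per_customer = defaultdict(int)
    for order in orders:
        per_customer[order["customer_id"]] += order["amount"]
    totals = defaultdict(int)
    for customer in customers:  # the join now sees one row per customer_id
        if customer["id"] in per_customer:
            totals[customer["region"]] += per_customer[customer["id"]]
    return dict(totals)
```

Both plans return the same totals per region, but the eager plan feeds the join one row per `customer_id` instead of one row per order. Whether that pays off depends on how much the pre-aggregation shrinks its input, which is why cardinality estimates matter.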

    Tutorial: The discrete-sectional method to simulate an evolving aerosol

    The discrete-sectional method for solving the general dynamic equations is a useful tool for simulating an evolving aerosol population. This tutorial is intended to equip the reader with the knowledge needed to implement this method for a single-component system. To this end, we provide step-by-step instructions for constructing a discrete-sectional model, including details on simulation bin configurations and all the equations needed to describe the relevant physical processes in an aerosol, i.e., condensation/evaporation, coagulation, and external particle losses. The text is supplemented by functional, open-source MATLAB code that implements the framework introduced in this tutorial. Interested readers can use the code either for learning purposes or to meet research demands. Lastly, we designed six test cases, not only to verify the validity of our discrete-sectional model but also to help the reader gain insight into the evolution of aerosol systems. Peer reviewed.
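As a rough illustration of the discrete part of such a model, the sketch below advances the number concentrations of clusters of size 1 to `kmax` by one explicit-Euler step of coagulation alone. It is not the tutorial's MATLAB code: the constant collision kernel, the time step, and the function name `coagulation_step` are assumptions, and sectional bins, condensation/evaporation, and external losses are left out.

```python
def coagulation_step(n, kernel, dt):
    """Advance discrete cluster concentrations by one explicit-Euler step.

    n[k - 1] is the number concentration of clusters made of k monomers.
    kernel is a constant collision rate coefficient (an assumption made
    for brevity; real kernels depend on the sizes of both partners).
    """
    kmax = len(n)
    total = sum(n)  # every size can collide with the cluster of interest
    new = list(n)
    for k in range(1, kmax + 1):
        # Gain: pairs (i, k - i) merge into size k; 0.5 avoids double counting.
        gain = 0.5 * sum(kernel * n[i - 1] * n[k - i - 1] for i in range(1, k))
        # Loss: size k collides with any cluster. Products larger than kmax
        # simply leave the simulated size range in this sketch.
        loss = kernel * n[k - 1] * total
        new[k - 1] = n[k - 1] + dt * (gain - loss)
    return new


def total_mass(n):
    """Total monomer count held in clusters of all tracked sizes."""
    return sum(k * nk for k, nk in enumerate(n, start=1))
```

Starting from monomers only, one step moves part of the population into dimers while the total monomer count stays constant, which is a convenient sanity check for any coagulation routine.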

    Digital piracy, creative productivity, and customer care effort: evidence from the digital publishing industry

    We empirically investigate how writers’ output is affected by copyright piracy using data from a Chinese digital publishing platform. We identify two measurements of writers’ output, creative productivity and customer care, which are also affected by readers’ feedback through purchasing, tipping, and commenting. We take advantage of an exogenous event, the termination of a free personal storage service and search function by a leading Chinese cloud storage provider in June 2016, to causally identify the effects of the resulting reduced copyright piracy on writers’ efforts. Using a difference-in-differences modeling approach, we compare the changes in average writer behavior before and after the event across two groups of writers: (1) writers who have profit-sharing contracts with the platform and (2) those who do not. We find that after the termination, contracted writers increased their creative productivity efforts in terms of quantity without sacrificing quality but reduced their customer care efforts. However, these effects are absent for noncontracted writers. Our study is among the first to provide empirical support for the positive effect of digital intellectual property rights infringement reduction on creative productivity.
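The core of a difference-in-differences comparison can be sketched as a contrast of four group means. The code below is a minimal illustration, not the paper's model: the function name `did_estimate` and the outcome values are invented, and a real analysis would add controls, fixed effects, and standard errors.

```python
from statistics import mean


def did_estimate(rows):
    """Return the 2x2 difference-in-differences estimate.

    rows is an iterable of (treated, post, outcome) tuples, where
    treated marks the group exposed to the event (for instance,
    contracted writers) and post marks observations after the event.
    """
    cells = {(t, p): [] for t in (False, True) for p in (False, True)}
    for treated, post, outcome in rows:
        cells[(treated, post)].append(outcome)
    m = {key: mean(values) for key, values in cells.items()}
    change_treated = m[(True, True)] - m[(True, False)]
    change_control = m[(False, True)] - m[(False, False)]
    # The control group's change stands in for what the treated group
    # would have done without the event (parallel-trends assumption).
    return change_treated - change_control


# Invented outcomes, e.g. chapters published per week (not from the paper).
SAMPLE = [
    (True, False, 10.0), (True, False, 12.0),   # treated, before
    (True, True, 15.0), (True, True, 17.0),     # treated, after
    (False, False, 8.0), (False, False, 10.0),  # control, before
    (False, True, 9.0), (False, True, 11.0),    # control, after
]
```

With these made-up numbers the treated group rises by 5 and the control group by 1, so the estimate attributes the remaining 4 to the event.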

    Thick Cloud Removal of Remote Sensing Images Using Temporal Smoothness and Sparsity-Regularized Tensor Optimization

    In remote sensing images, thick cloud accompanied by cloud shadow is highly likely to be present, which can degrade the quality of subsequent processing and limit the range of applications. Hence, removing the thick cloud and cloud shadow and recovering the cloud-contaminated pixels is indispensable for making good use of remote sensing images. In this paper, a novel thick cloud removal method for remote sensing images based on temporal smoothness and sparsity-regularized tensor optimization (TSSTO) is proposed. The basic idea of TSSTO is that the thick cloud and cloud shadow are not only sparse but also smooth along the horizontal and vertical directions within images, while the clean images are smooth along the temporal direction between images. Therefore, the sparsity norm is used to promote the sparsity of the cloud and cloud shadow, and unidirectional total variation (UTV) regularizers are applied to ensure the unidirectional smoothness. This paper uses the alternating direction method of multipliers (ADMM) to solve the presented model and generate the cloud and cloud shadow element as well as the clean element. The cloud and cloud shadow element is purified to obtain the cloud area and cloud shadow area. Then, the corresponding area of the clean element is replaced with the clean area of the original cloud-contaminated images. Finally, a reference image is selected to reconstruct the details of the cloud area and cloud shadow area using the information cloning method. A series of experiments is conducted on both simulated and real cloud-contaminated images from different sensors and with different resolutions, and the results demonstrate the potential of the proposed TSSTO method for removing cloud and cloud shadow from both qualitative and quantitative viewpoints.
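Two ingredients named in the abstract can be shown in isolation. The sketch below gives the scalar soft-thresholding operator that ADMM solvers commonly use for an L1 sparsity term, plus plain unidirectional total variation along the horizontal and vertical directions of one image. It is a generic illustration with hypothetical function names; the paper's exact tensor model, weights, and update order are not reproduced.

```python
def soft_threshold(x, tau):
    """Proximal operator of tau * |x|: shrink x toward zero by tau."""
    if x > tau:
        return x - tau
    if x < -tau:
        return x + tau
    return 0.0


def utv_horizontal(img):
    """Sum of absolute differences between horizontally adjacent pixels."""
    return sum(
        abs(row[j + 1] - row[j]) for row in img for j in range(len(row) - 1)
    )


def utv_vertical(img):
    """Sum of absolute differences between vertically adjacent pixels."""
    return sum(
        abs(img[i + 1][j] - img[i][j])
        for i in range(len(img) - 1)
        for j in range(len(img[i]))
    )
```

Small entries are set exactly to zero by `soft_threshold`, which is what makes the recovered cloud component sparse, while a low UTV value along a direction indicates smoothness along that direction. The temporal UTV term would apply the same difference across images in a time series.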

    TEST: Text Prototype Aligned Embedding to Activate LLM's Ability for Time Series

    This work summarizes two strategies for completing time-series (TS) tasks using today's large language models (LLMs): LLM-for-TS, which designs and trains a fundamental large model for TS data, and TS-for-LLM, which enables a pre-trained LLM to handle TS data. Considering insufficient data accumulation, limited resources, and semantic context requirements, this work focuses on TS-for-LLM methods, where we aim to activate the LLM's ability for TS data by designing a TS embedding method suitable for LLMs. The proposed method is named TEST. It first tokenizes the TS and builds an encoder to embed the tokens by instance-wise, feature-wise, and text-prototype-aligned contrast; it then creates prompts to make the LLM more open to the embeddings, and finally implements TS tasks. Experiments are carried out on TS classification and forecasting tasks using 8 LLMs with different structures and sizes. Although its results do not significantly outperform the current SOTA models customized for TS tasks, by treating the LLM as a pattern machine, it can endow the LLM with the ability to process TS data without compromising its language ability. This paper is intended to serve as a foundational work that will inspire further research. Comment: 10 pages, 6 figures
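Contrastive embedding objectives of the kind mentioned above are often written as an InfoNCE-style loss, which pulls an anchor embedding toward a positive example and away from negatives. The sketch below shows that generic loss on plain Python lists. It is an assumption-laden illustration, not the TEST objective: the function names, the temperature value, and the choice of cosine similarity are all hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two non-zero vectors given as lists."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def info_nce(anchor, positive, negatives, temperature=0.1):
    """InfoNCE-style loss: low when the anchor is closest to the positive."""
    pos = math.exp(cosine(anchor, positive) / temperature)
    neg = sum(math.exp(cosine(anchor, n) / temperature) for n in negatives)
    return -math.log(pos / (pos + neg))
```

In an instance-wise setting the positive would be an augmented view of the same time-series token. In a prototype-aligned setting it could be the embedding of a selected text token, although how TEST chooses and weights these pairs is not stated in the abstract.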